Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
Abstract
This article deals with distributed policy optimization in reinforcement learning, which involves a central controller and a group of learners. In particular, two typical settings encountered in several applications are considered: multiagent reinforcement learning (RL) and parallel RL, where frequent information exchanges between the learners and the controller are required. For many practical systems, however, the overhead caused by these communications is considerable and becomes the bottleneck of the overall performance. To address this challenge, a novel policy gradient approach is developed for solving distributed RL. The approach adaptively skips gradient communication during iterations, and can reduce the communication overhead without degrading learning performance. It is established analytically that: i) the algorithm has a convergence rate identical to that of the plain-vanilla policy gradient; while ii) if the learners are heterogeneous in terms of their reward functions, the number of communication rounds needed to achieve a desirable learning accuracy is markedly reduced. Numerical experiments corroborate the communication reduction attained compared with alternatives.
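The skipping idea can be illustrated with a small sketch. This is a generic lazily-aggregated variant, not the article's exact trigger rule: each learner uploads a fresh gradient only when it differs enough from the last copy it transmitted, and the controller otherwise reuses the stale cached copy. The toy quadratic surrogate gradients, threshold, and step size below are illustrative assumptions.

```python
import numpy as np

def lazy_aggregate(grads, cache, threshold):
    """Each learner uploads its fresh gradient only if it differs enough
    (in squared norm) from the last copy it sent; otherwise the central
    controller reuses the stale cached copy. Returns the aggregated
    gradient and the number of uploads this round."""
    sent = 0
    for m, g in enumerate(grads):
        if np.sum((g - cache[m]) ** 2) > threshold:
            cache[m] = g.copy()       # learner m communicates
            sent += 1
    return np.mean(cache, axis=0), sent

# Toy run: 5 learners, 3-dim parameter. The local "policy gradients"
# are stand-ins (a quadratic surrogate plus small noise), not real
# REINFORCE estimates.
rng = np.random.default_rng(0)
M, d, steps = 5, 3, 20
theta = np.ones(d)
cache = [np.zeros(d) for _ in range(M)]
total_sent = 0
for _ in range(steps):
    grads = [-theta + 0.01 * rng.standard_normal(d) for _ in range(M)]
    agg, sent = lazy_aggregate(grads, cache, threshold=0.05)
    theta = theta + 0.1 * agg         # ascent step at the controller
    total_sent += sent
print(f"{total_sent} uploads out of {steps * M} possible")
```

As the parameter settles, successive local gradients change little, so most rounds trigger no uploads, which is the source of the communication savings the abstract describes.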
Similar Resources
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by its own function approximator, independent of the value function, and is updated according to the ...
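The simplest instance of a policy represented by its own function approximator is a softmax policy trained with the REINFORCE (likelihood-ratio) gradient. A minimal sketch on a two-armed bandit follows; the bandit payoffs, learning rate, and step count are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max()               # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

means = np.array([0.2, 0.8])      # assumed payoffs; arm 1 pays more
theta = np.zeros(2)               # policy parameters, one per arm

for _ in range(2000):
    p = softmax(theta)
    a = rng.choice(2, p=p)        # sample an action from the policy
    r = means[a] + 0.1 * rng.standard_normal()
    grad_log = -p                 # grad log pi(a) = one_hot(a) - p
    grad_log[a] += 1.0            # for a softmax policy
    theta += 0.1 * r * grad_log   # REINFORCE ascent step

print(softmax(theta))  # probability mass should concentrate on arm 1
```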
Exponentiated Gradient Methods for Reinforcement Learning
This paper introduces and evaluates a natural extension of linear exponentiated gradient methods that makes them applicable to reinforcement learning problems. Just as these methods speed up supervised learning, we find that they can also increase the efficiency of reinforcement learning. Comparisons are made with conventional reinforcement learning methods on two test problems using CMAC function...
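The core exponentiated-gradient step is multiplicative rather than additive: weights are scaled by the exponential of the gradient and renormalized, so they stay positive and sum to one. A minimal sketch (the learning rate and gradient values are illustrative assumptions):

```python
import numpy as np

def eg_update(w, grad, eta=0.5):
    """Exponentiated-gradient ascent on a probability vector:
    multiply each weight by exp(eta * gradient) and renormalize."""
    w = w * np.exp(eta * grad)
    return w / w.sum()

w = np.full(3, 1.0 / 3.0)                     # uniform starting weights
w = eg_update(w, np.array([1.0, 0.0, -1.0]))  # favors the first component
```

The multiplicative form keeps the iterate on the probability simplex without any projection step, which is what makes these updates attractive for weight vectors like CMAC tile activations.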
Gradient Sparsification for Communication-Efficient Distributed Optimization
Modern large scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead for exchanging information such as stochastic gradients among different workers. In this paper, to reduce the communication cost we propose a convex optimization formulation to minimize the coding...
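That paper derives an optimized convex coding scheme; a simpler magnitude-based top-k rule, shown here purely for illustration, conveys the same idea of transmitting only a few gradient coordinates per round:

```python
import numpy as np

def sparsify_top_k(grad, k):
    """Keep only the k largest-magnitude entries of the gradient and
    zero the rest; a worker then transmits just those (index, value)
    pairs instead of the dense vector."""
    idx = np.argsort(np.abs(grad))[-k:]   # indices of the k largest magnitudes
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse, idx

g = np.array([0.05, -2.0, 0.1, 1.5, -0.02])
sg, kept = sparsify_top_k(g, 2)
print(sg)  # only the entries -2.0 and 1.5 survive
```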
Scalable Multitask Policy Gradient Reinforcement Learning
Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach that aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer f...
Model-based Policy Gradient Reinforcement Learning
Policy gradient methods based on REINFORCE are model-free in the sense that they estimate the gradient using only online experiences executing the current stochastic policy. This is extremely wasteful of training data as well as being computationally inefficient. This paper presents a new model-based policy gradient algorithm that uses training experiences much more efficiently. Our approach con...
Journal
Journal title: IEEE Transactions on Control of Network Systems
Year: 2022
ISSN: 2325-5870, 2372-2533
DOI: https://doi.org/10.1109/tcns.2021.3078100